DOCUMENT RESUME

ED 099 429                                          TM 004 307

AUTHOR          Ebel, Robert L.
TITLE           State Testing Programs: Status, Problems, and
                Prospects. TM Report 40.
INSTITUTION     ERIC Clearinghouse on Tests, Measurement, and
                Evaluation, Princeton, N.J.
SPONS AGENCY    National Inst. of Education (DHEW), Washington,
                D.C.
REPORT NO       ERIC-TM-40
PUB DATE        Dec 74
CONTRACT        OEC-0-70-3797-519
NOTE            6p.; For related documents, see ED 086 721 and
                087 789

EDRS PRICE      MF-$0.75 HC-$1.50 PLUS POSTAGE
DESCRIPTORS     Criterion Referenced Tests; *Educational
                Assessment; *Educational Testing; Standardized
                Tests; *State Programs; State Surveys; *Surveys;
                *Testing Programs
IDENTIFIERS     Tailor Made Tests

ABSTRACT
The current status of state testing programs is
assessed, drawing primarily on information provided by the Educational
Testing Service publication, "State Testing Programs, 1973 Revision."
Increases in state operated programs are indicated and are probably
due to an increase in federal money for testing purposes. Because of
possible confusion over the differences between a state testing
program, a state assessment program, and a state testing service,
some explanation is given as to the properties of each. A history of
state testing programs is outlined, and new directions for such
programs are proposed. Criterion-referenced and norm-referenced
testing are contrasted, and the advantages and limitations of
criterion-referenced tests are indicated. The problem of evaluating
affective educational outcomes is explored and may explain
the very limited role of noncognitive tests in state testing
programs. The relation between the purposes of testing and the time
of year the tests are given is discussed, and this timing is seen to
affect the extent to which a particular purpose is served well or
poorly. As to the type of test that should be given, standardized
tests and tailor-made tests are compared, and their advantages and
limitations are discussed. (SC)

TM REPORT 40

ERIC CLEARINGHOUSE ON TESTS, MEASUREMENT, & EVALUATION
EDUCATIONAL TESTING SERVICE, PRINCETON, NEW JERSEY 08540

DECEMBER 1974

STATE TESTING PROGRAMS: STATUS, PROBLEMS, AND PROSPECTS

Robert L. Ebel

The Current Status of State Testing Programs

State programs of testing and assessment are prominent
features of the contemporary educational scene. In a few
cases, these programs simply continue efforts to measure
pupil achievements, efforts that began decades or, in one
case, over a century ago. In many more cases, they are
recent innovations, responses to increasing demands for
accountability, or to needs for the evaluation of
innovations in education.

Only 13 of the 50 states do not now have a statewide
testing program. In three of these, plans are being
developed for the inauguration of a testing program.
Seven states in the midwest have programs operated by
agencies of their state universities. These are supported
mainly out of local school district budgets. While
participation in the programs is voluntary, many, or in some
cases most, of the schools in the state take advantage of
the testing services offered by the universities.

In 31 states,¹ testing programs are operated under the
direction of the state department of education. Nineteen
of these states report that their testing programs are
substantially or totally supported by the federal government
from funds available under Title III of the Elementary
and Secondary Education Act. In the other 14 states, the
necessary funds are provided by the state government.

A Survey of State Testing Programs

These and a number of other interesting and useful facts
are presented in State Testing Programs, 1973 Revision,
a publication developed by Educational Testing Service
(ETS) in collaboration with the Conference of Directors
of State Testing Programs. The factual material
presented in that report was obtained through telephone
interviews with the individual or individuals in each state
who appeared most likely to provide detailed and
accurate information. These state authorities were told in
advance what questions would be asked during the
interview so that they could be prepared to give accurate
answers.

¹In one state, both the state department of education and a state
university operate testing programs.

Then and Now in State Testing Programs

The immediate forerunner of the 1973 publication was a
similar report prepared by ETS in 1968. A much earlier
publication, intended to serve the same purpose, was
State Testing and Evaluation Programs by David Segel,
published in 1951 by the U.S. Office of Education. It is
interesting to see how many states were offering each
kind of program then and now.



Program operated by        1951      1973

State department             17        31
State university             19         7
No program                   17        13

Total                        53²       51³



Note the sharp increase in the number of testing
programs operated by state departments of education.
Part of this increase, perhaps most of it, is almost
certainly due to the availability of federal funds (e.g.,
NDEA-1958, ESEA-1965) for testing purposes. When these
funds are no longer available, the number of state
operated programs may be cut drastically.

Note also the sharp decrease in the number of testing
programs operated by state universities. These programs
are usually voluntary since the universities have no direct
control over local education authorities. Their costs are
borne by the local district. They tend to emphasize
guidance and instructional assistance rather than
assessment of educational effectiveness. As state legislatures
make increasingly heavy investments in local school
support, they become increasingly interested in the kind
and amount of education their dollars are buying. These
may be some of the factors which account for the
replacement of voluntary, university-based testing
programs by mandatory programs operated out of state
departments of education.

²In 1951, five states had dual testing programs, one operated by the
state department and the other by the state university.
³In 1973 only one state had both state department- and
university-operated programs.

This publication was prepared pursuant to a contract with the National Institute of Education, U.S. Department of Health, Education
and Welfare. Contractors undertaking such projects under government sponsorship are encouraged to express freely their judgment in
professional and technical matters. Points of view or opinions do not, therefore, represent official National Institute of Education
position or policy.

The figures just presented and discussed probably
convey a reasonably accurate picture of the situation
then and now with respect to state testing programs.
However, despite the care taken by ETS to obtain
accurate information, there may have been some
differences of opinion among respondents to the survey over
the essential characteristics of a "state testing program."
Does the term apply only to programs operated under
mandate of the state government? Does it apply only to
programs in which participation of all local education
authorities is required? Does it apply only to programs in
which all examinees are required to take the same tests?
Does it apply to tests given to evaluate the effectiveness of
educational programs? Is assessment the same as
testing? Questions such as these suggest that it may be
useful to define, and to distinguish among, the terms:

1. State testing program

2. State assessment program

3. State testing service

Testing, Assessment, and Service Programs

A state testing program is available statewide but is
limited to the schools in a particular state. It involves the
use of a common test or set of tests. Participation in the
program may be mandatory or voluntary. Program costs
may be borne by a state agency or by the local school. The
organization responsible for administration of the testing
program provides standard directions for scheduling and
administering tests. It also provides assistance in
obtaining, interpreting, and utilizing the test scores.

A state assessment program is similar to a state testing
program and may, in some cases, be identical. That is,
some testing programs may be called assessment
programs mainly to avoid the threatening or otherwise
unpleasant connotations of the term "testing." But there
may be some characteristic differences.

An assessment program is likely to focus more on the
effectiveness of an educational program than on the
achievements of individual pupils. Participation in an
assessment program is more likely to be mandatory, and
the costs are more likely to be borne by a state
department.

The testing itself may involve matrix sampling. That
is, both pupils and items are sampled. Instead of asking
all pupils to take the same long, comprehensive test,
different pupils take different, much shorter, sets of test
items. This means a good estimate of group achievement
can be obtained in much less time of testing.
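
The idea of matrix sampling can be illustrated with a small simulation. What follows is only a sketch under stated assumptions, not a description of any state's procedure: the pool of 60 items, the 2,000 pupils, the 10 items given to each pupil, and the function name matrix_sample_estimate are all hypothetical, and responses are simulated rather than taken from any survey.

```python
import random


def matrix_sample_estimate(responses, items_per_pupil, rng):
    """Estimate the group's proportion correct when every pupil
    answers only a random subset of the item pool."""
    n_items = len(responses[0])
    correct = 0
    administered = 0
    for pupil in responses:
        # Each pupil is assigned a different short set of items.
        subset = rng.sample(range(n_items), items_per_pupil)
        correct += sum(pupil[i] for i in subset)
        administered += items_per_pupil
    return correct / administered


# Simulated data (an assumption for illustration only):
# each item has its own chance of being answered correctly.
rng = random.Random(0)
N_PUPILS, N_ITEMS = 2000, 60
item_easiness = [rng.uniform(0.3, 0.9) for _ in range(N_ITEMS)]
responses = [
    [1 if rng.random() < p else 0 for p in item_easiness]
    for _ in range(N_PUPILS)
]

full_mean = sum(sum(pupil) for pupil in responses) / (N_PUPILS * N_ITEMS)
sampled_mean = matrix_sample_estimate(responses, 10, rng)
print(f"every pupil takes all 60 items: {full_mean:.3f}")
print(f"every pupil takes 10 items:     {sampled_mean:.3f}")
```

In this sketch each pupil spends one sixth of the testing time, yet the two group estimates should lie close together. The shortened sets say little about any individual pupil, which is consistent with the focus of assessment programs on group rather than individual achievement.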

Despite these differences, it seems reasonable to
regard a state assessment program as one kind of state
testing program. Surely it is useful to include data on
such programs in a survey of the kind recently made by
ETS, and it was clearly the intention of ETS to include
them. But it is equally clear that at least one state with an
extensive program of assessment did not report it as a
testing program.

A state testing service is usually operated to assist local
schools in (1) obtaining the tests they want to use, (2)
scoring the tests, and (3) interpreting and utilizing the
test results. It operates statewide but involves no limited
or prescribed set of tests. Most of the costs of testing are
borne by the local school districts. In some states, an
annual conference is used as a kind of inservice training
toward more effective test utilization.

Clearly, there are important differences between a
state testing program and a state testing service. But
again, it seems useful to include reports on such services
in any general survey of state testing programs.

State testing programs have a long history. One,
instituted by the Regents of the University of the State of New
York, began in 1865. Later, some states began to
administer tests to certify satisfactory completion of the
first eight grades. But the rapid expansion of state testing
programs, begun after World War I, was influenced by
at least two factors. One was a general concern for
efficiency in business, industry, and indeed all enterprises,
including education. To determine efficiency one must
measure results. The other factor was development and
refinement of techniques for measuring educational
achievements and psychological traits. This influence
began well before World War I, but it was given strong
impetus by needs for testing in the personnel selection
and training programs of the military services.

The testing programs initiated in the 1920s and 1930s
flourished for several decades. A few have continued to
flourish. Others declined and were abandoned. One
reason for this may be the basic antipathy of school
administrators to external evaluations. Another may be
that local schools, having developed their own special
testing programs, see less need for participation in a
uniform, external, state program. Still another may be
the view strongly held by some educators that schools
should be more concerned with a pupil's feelings,
self-concept, and adjustment than with his knowledge,
self-discipline, and achievement. Those who endorse this
view regard tests and testing programs not as useful
educational tools but as obstacles to attainment of the
goals they seek. Finally, it is possible that some state
testing programs languished and died simply because
they were not good enough, because they did not seem to
meet basic educational needs, as those needs were
perceived by persons who controlled the schools.

Whatever the cause, the years since 1950 have
witnessed a decline in the kind of state testing programs
that flourished before then. But in the decade of the
sixties, a set of forces began to operate to create a
new set of state testing programs.

One of these forces, probably the most powerful one, has
already been mentioned. It is the increasing role of state
legislatures in providing funds for local school
operations. Their responsibility to the electorate is to see that
these funds are well spent. Hence, they support testing
programs that promise to provide some of the evidence of
educational outcomes they want.

A second force is increasing skepticism concerning the
effectiveness of contemporary schools. The alleged
failure of inner city schools has been well advertised.
Innovative programs like Head Start, designed and
promoted by educators, have yielded disappointing results.
There is a widespread feeling that schools could do a
better job if they would only try harder. Performance
contracting and other strategies to make the schools
more accountable have had considerable appeal. The
crucial role of testing in these strategies has lent support
to the extension of state testing programs.

A third force contributing to the renaissance of state
testing programs is the "new look" of testing. This goes
beyond substitution of the word "assessment" for
"testing" in the program designation. It goes beyond a shift in
the focus of attention from individual pupil achievement
to curricular and instructional effectiveness. It involves
mainly a somewhat different approach to the
measurement of achievement, an approach that has been
designated by terms such as "content-referenced
testing," "domain-referenced testing," or
"criterion-referenced testing." Two of the programs reported in the
recent ETS survey mention their use, or intended use, of
criterion-referenced tests. Others are no doubt also using
them or considering their use.

Criterion-Referenced Testing

Criterion-referenced testing is often contrasted with
norm-referenced testing. The aim of the first mentioned
is to determine how many, and which ones, of a specified
set of instructional objectives have been attained. Thus,
the result of such a test may be a number, a percent, or a
list of attainments. The aim of the norm-referenced test,
on the other hand, is to indicate how the attainments of a
particular pupil compare with those of his peers. The
results (raw scores) of norm-referenced tests are usually
converted into percentiles, grade equivalents, or
standard scores. The meanings of all of these converted
scores are essentially relative.
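
The difference between the two kinds of reports can be sketched in a few lines of code. This is an illustration only: the norm group scores, the objectives, the item results, and the 75 percent cutoff are invented for the example, and the function names are hypothetical. The percentile rank uses one common convention (scores below, plus half of the ties).

```python
from statistics import mean, pstdev


def percentile_rank(score, norm_scores):
    """Norm-referenced: percent of the norm group scoring below the
    pupil, counting half of the ties (one common convention)."""
    below = sum(s < score for s in norm_scores)
    equal = sum(s == score for s in norm_scores)
    return 100.0 * (below + 0.5 * equal) / len(norm_scores)


def standard_score(score, norm_scores):
    """Norm-referenced: distance from the norm group mean in
    standard deviation units (a z-score)."""
    return (score - mean(norm_scores)) / pstdev(norm_scores)


def objectives_attained(item_results, cutoff):
    """Criterion-referenced: list the objectives on which the pupil
    answered at least `cutoff` of the items correctly."""
    return [
        objective
        for objective, items in item_results.items()
        if sum(items) / len(items) >= cutoff
    ]


# Hypothetical data for one pupil and a small norm group.
norm_group = [12, 15, 18, 20, 20, 22, 25, 27, 30, 31]
raw_score = 22
item_results = {
    "adds fractions": [1, 1, 1, 0],
    "reads a bar graph": [1, 0, 0, 1],
    "solves linear equations": [1, 1, 1, 1],
}

print("percentile rank:", percentile_rank(raw_score, norm_group))
print("standard score:", standard_score(raw_score, norm_group))
print("objectives attained:", objectives_attained(item_results, 0.75))
```

The first two results change whenever the norm group changes, which is the sense in which converted scores are relative. The third depends only on the pupil's own responses and on the chosen cutoff.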

Criterion-referenced tests have some obvious
advantages over norm-referenced tests. They can indicate
directly what, and how much, the learner knows and can
do. In tightly structured sequential learning, they can
indicate when the student is ready to move ahead to the
next phase. And they help to avoid direct comparisons of
one pupil's achievements with those of another. Such
comparisons, often made unfairly, have been the basis
for some criticisms of norm-referenced tests.

But there are also some possible limitations of
criterion-referenced tests. Because they sometimes focus on
the attainment of a limited number of separate, discrete,
highly specific objectives, they may induce teachers to
neglect cultivation of more general capabilities for
dealing with other related but unspecified problems.
They may not encompass adequately the very large
number of interrelated concepts, facts, ideas, principles,
meanings, understandings, and so on that constitute
learning in many areas. Emphasis on discrete specifics
may lead to neglect of the integration of ideas that gives
unity and solidarity to a subject. It may cause teachers
and students to seek adequate performance of specified
tasks through sheer memorization or habit forming, at
the expense of understanding.

Another possible limitation lies in the difficulty of
specifying adequacy of performance (mastery?) with
respect to each objective. Learning is almost always a
matter of degree. The statement "You either know
something or you don't!" does not describe accurately the
acquisition of knowledge in most areas of learning. Nor
would a similar statement describe accurately the
acquisition of an ability. This leaves the assessor with the
question "How much knowledge or ability is enough?" It
is a question that can seldom be answered on other than
an arbitrary, conventional, not-clearly-rational-or-defensible
basis.

Related to this is the difficulty of determining reliably
whether a particular student has achieved a particular
objective. In many criterion-referenced tests, the
attainment of each objective is tested by only a few items,
sometimes only by one. Single test items, or very short
tests, are notoriously unreliable. As a consequence of this
unreliability, a substantial number of the students tested
may be judged wrongly to have attained, or to have failed
to attain, an adequate level of achievement with respect
to a particular objective.
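
The size of this problem can be suggested with a simple binomial calculation. The sketch below rests on assumptions that are not part of the survey: items on an objective are treated as independent, a pupil is given a fixed chance p_true of answering any one of them correctly, and "attainment" is declared when at least min_correct items are right. The function name and the particular values are hypothetical.

```python
from math import comb


def prob_judged_attained(p_true, n_items, min_correct):
    """Probability that a pupil who answers each item correctly with
    probability p_true gets at least min_correct of n_items right
    (binomial model, items assumed independent)."""
    return sum(
        comb(n_items, k) * p_true**k * (1 - p_true) ** (n_items - k)
        for k in range(min_correct, n_items + 1)
    )


# A pupil well short of mastery (p = 0.6), judged on a single item.
false_pass_short = prob_judged_attained(0.6, 1, 1)
# The same pupil judged on 20 items, 16 required.
false_pass_long = prob_judged_attained(0.6, 20, 16)

# A pupil with strong command (p = 0.9), three items, all required.
false_fail_short = 1 - prob_judged_attained(0.9, 3, 3)
# The same pupil judged on 20 items, 16 required.
false_fail_long = 1 - prob_judged_attained(0.9, 20, 16)

print(f"weak pupil wrongly passed, 1 item:     {false_pass_short:.3f}")
print(f"weak pupil wrongly passed, 20 items:   {false_pass_long:.3f}")
print(f"strong pupil wrongly failed, 3 items:  {false_fail_short:.3f}")
print(f"strong pupil wrongly failed, 20 items: {false_fail_long:.3f}")
```

Under these assumptions both kinds of wrong judgment become much less likely as the number of items per objective grows, which is why a test that gives each objective only one or a few items invites misclassification.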

With more widespread usage, and more varied
experience, an answer may gradually emerge to the question
"Do the practical advantages of criterion-referenced
tests outweigh their practical limitations?" Thus far they
seem to have been used most successfully in testing for
acquisition of basic skills in the early elementary grades.
Whether they can be used effectively at higher levels of
education also remains to be seen.

It may be worth mentioning that relatively few well
informed, penetrating analyses of the strengths and
weaknesses of criterion-referenced and of
norm-referenced tests have been published in educational
journals. Specialists in educational measurement seem
much more concerned with adapting the statistics of
norm-referenced tests to the somewhat different
materials and purposes of criterion-referenced tests. This
may be understandable, but it may also be regrettable.

Testing Noncognitive Educational Outcomes

It is generally agreed that while schools seldom succeed
brilliantly in achieving their cognitive goals, they have
distinctly better success in the cognitive than in the
affective domain. It is apparently much more difficult to
define objectives, develop instructional programs, and
evaluate affective outcomes than cognitive ones.

These difficulties may be largely responsible for the
very limited role of noncognitive tests in state testing
programs. In only nine states are noncognitive areas
tested. Two states give noncognitive tests at both
elementary and secondary levels; four give them only to
elementary school students, and three give them only to
secondary school students.

Noncognitive areas most frequently tested in
elementary schools are attitudes toward school and
self-concept. Those most frequently tested in secondary
schools are interests and attitudes toward school. There
is a notable absence of any attempt to assess student
values (apart from interests and attitudes) in any of the
state testing programs.

Are cognitive outcomes being overemphasized in
school programs and in the testing programs designed to
measure their effectiveness? Some critics of
contemporary education contend that they are. It seems to
them that a person's interests and values, his aspirations
and attitudes, and his self-concept are crucially
important in determining the quality of life he will live, and his
success in living it. They conclude that schools should
stress the attainment of noncognitive goals fully as much
as cognitive goals are now being stressed.

Other educators are inclined to question that
conclusion. They do not minimize the importance of
interests and aspirations, of attitudes and values. They do
not object to the use of school time to consider cognitive
aspects of these affective manifestations. But they
contend that the school is not authorized, or equipped, to
mold a student's values, his attitudes, his interests, or his
aspirations to fit some prescribed specifications. They
believe that it is largely inappropriate for schools to
define goals, design treatments, and assess outcomes in
these areas.

Legislation is pending in at least one state (Michigan)
that expressly forbids schools or teachers from
attempting to "educate" their students affectively ". . . by
acting as a change agent of attitudes, values, and
religious or political beliefs of the pupils."⁴ There is no
doubt that students do acquire affective responses in
school, as they do at home and elsewhere. The question is
whether or not schools should set out to teach certain
affective responses purposefully and directly and whether
they have any socially acceptable, noncognitive means
for doing so, apart from enforcement of codes of
behavior sanctioned by the local community and the
student body. At the moment, there appears to be more
general support for negative than for affirmative answers
to these questions. But here again, only time, and the
good judgment of well-informed educators, will tell.

⁴House Bill No. 5004

When Should the Tests Be Given?

In most of the early state testing programs, the tests were
given in the spring at the end of the school year. This
seemed a logical time to test for what had been learned.
Then, partly in response to criticisms of end-of-year
testing, some programs moved to fall or to midyear
testing. Now, with renewed emphasis on assessing the
results of instructional programs, there is some tendency
to return to testing near the close of the school year.

The recent ETS survey showed that October was the
month most frequently used for testing, followed by
September, April, and May. Of course, some programs
offer tests, usually of different kinds and for different
purposes, more than once a year. It is interesting to note
two programs in which tests are administered during the
usual vacation months of July and August.

As suggested above, there is a relation between the
purposes for testing and the time of year when the tests
are given. Tests intended for guidance, for identification
of individual problems and talents, or for placement and
grouping are usually given in the fall. Those intended for
evaluation of educational programs or instruction are
most frequently given in the spring. Of course, any of
these purposes can be served to some degree by tests
given at any time of the year. But the timing of the
testing does affect the extent to which a particular purpose is
served well or poorly.

What Kinds of Tests Should Be Given?

The tests that actually are being used are mainly tests of
basic skills in reading, mathematics, and language; of
basic understandings in natural science and social
science; of aptitudes; and of study skills. Many of these
are standardized tests, available from commercial test
publishers. But tests that were tailor-made, or especially
revised, were used in a substantial minority of the states.

Standardized tests have a number of advantages. One,
of course, is ready availability. The director of a state
testing program can usually secure the tests and other
needed materials such as answer sheets, directions for
administration, and manuals for score interpretation on
relatively short notice. Once the structure of the program has
been determined, the time required to get it into
operation need not be long.






But this ready availability can sometimes be a
disadvantage. Teachers with pupils to be tested may secure
copies of the tests in advance and use them in
"teaching" (coaching). Such practices seriously limit the
validity of the tests as measures of real achievement and
lead to grossly unfair comparisons between schools.
Directors of state testing programs that make use of
published tests must take care to see that no advance
information on the test to be used is released.

A second valuable attribute of published standardized
tests is their generally high quality. Usually the
construction of a standardized test is directed by able, well
trained, experienced test specialists. It is true, as the
mental measurement yearbooks attest, that when these
test specialists look at tests constructed by other
specialists, they can often point out shortcomings and
suggest improvements. But most of the widely used
published standardized tests are about as high in quality
as the state of the art and the economic constraints of
publishing allow.

A third advantage of standardized tests is that
national norms frequently accompany them. These can
sometimes supply useful information to supplement state
and local norms, indicating how the educational
achievements of pupils in a particular school or state compare
with those of the nation as a whole.

A fourth characteristic of standardized tests is perhaps
more commonly regarded as a disadvantage than as an
advantage. The aspects of achievement covered in such
tests are those most generally regarded as important. If
they are not, the test is not likely to be widely used. The
aspects covered may not correspond closely with those
that are given greatest emphasis in a particular school or
system. But if a substantial discrepancy exists,
particularly in programs designed to develop the basic skills
or to cultivate understanding in basic areas of
knowledge, it may be the local program that is more to be
questioned than the standard test. The latter probably
has been developed on a broader basis of more expert
judgment than has the local program. It is common for
committees of expert teachers to have a hand in planning
and developing a standardized test.

While a good case can be made for experimental
innovations in methods of teaching, it is much harder to
make a strong case for local uniqueness in goals of
instruction in the common branches of learning. And even
if a school does have somewhat unique ideas about what
pupils should be learning, it probably is a good idea to
find out how this program is affecting the achievement of
what others regard as important.

The advantages and drawbacks of tailor-made tests are
roughly the reverse of those of standardized tests.
Perhaps the most apparent and important advantage of
the tailor-made test is that it can be designed to measure
the educational outcomes judged by the program policy
authorities to be most essential in their particular
situation. In some cases, this can be an extremely important
factor, but as was pointed out earlier, the worth of locally
unique educational goals in the common branches of
learning is open to some question.

Another advantage of the tailor-made test is that the
security of the test can be more easily protected. Misuse
of the test in coaching can be largely eliminated unless,
of course, the same test is used repeatedly. When test
security is protected, the validity of the test scores, and of
intergroup comparisons, can be maintained.

A major problem in the use of tailor-made tests is
finding good tailors. Few state departments of education,
and, indeed, not all state universities, have staff members
whose talent, training, and experience adequately qualify
them to do a good job of test development.

Contracting with an agency that specializes in testing
is probably the best solution to the problem of
tailor-made test development. Often such agencies have files of
tested items from which appropriate selections can be
made. But this solution brings problems of its own.

There is the problem of defining, and of
communicating to the testing agency, exactly what the contents and
characteristics of the tests should be. There is the
contractor's problem of meeting those specifications. And
there is the difficult problem of fixing responsibility for
the quality of the product when it is the product of joint
efforts. The test user is likely not to be wholly satisfied
with what the test producer gives him to use. This is not
to say that cooperative test development is unworkable.
But it is to say that the task is not as simple as it may
appear at first glance.

It follows from this that the cost of a tailor-made test is
likely to be higher than that of a published one. And, of
course, only state and local norms can ordinarily be
developed for a tailor-made test.

Thus, it appears that neither standardized nor
tailor-made tests provide ideal answers to the question of what
kind of tests should be given. But since no other answer
seems to be available, one of the two, or a combination of
both, may have to be chosen. It would be difficult to
exaggerate the importance of test quality to the success
of a state testing program. Other things, such as opposition of
school officials to "external" testing, lack of funds, or
inept management, may cause the program to languish
and fail. But without appropriate tests of high quality it
can have no hope of long run survival.

Other Problems of State Testing Programs

It is interesting to note that test quality was not
mentioned as a major problem by directors of any of the state
testing programs surveyed recently. The problem
reported most frequently (by 11 states) was funding. As
mentioned earlier in this article, reduction of ESEA Title
III funds may reduce or eliminate the testing programs
in some states.

Funds for educational purposes are seldom supplied
as generously as most educators would like. In the years
immediately ahead, public generosity for educational
purposes is likely to be somewhat less than it was during
the preceding quarter century. But while lack of
funds may be cited as the reason for the demise of some
testing programs, the real reasons will probably not be
financial stringency. Instead they may be that either:

1. The program did not yield clearly valid, easily
interpretable data on the effectiveness of educational
efforts in the state, or that

2. The public interest in obtaining such data was not
marshalled effectively enough to overcome the
objections of educators to external evaluations.

If state testing programs disappear, it will not be
primarily for lack of educational funds. It will be due
more fundamentally to deficiencies in the wisdom and
skill of the test specialists or to limitations in the vision
and strength of educational leaders.

A second problem mentioned by several directors of
state testing programs was "use of results," presumably
inadequate or inappropriate use. No doubt, examples of
these deficiencies could be cited. But the feeling that they
constitute a major problem may be exaggerated.

The purpose of many state testing programs is simply
to provide information; information that can be used as
part of the basis for decision making in the legislature,
the state department of education, the school board, or
in the classroom; information that will be so used if it is
relevant, reliable, and meaningfully reported. These uses
are likely to be numerous and diverse. They are set in
motion not by the production of the test data but by the
recognition of an educational problem. One thing that a
state testing program can do to promote effective use of
results is to prepare a suggestive case book of
appropriate uses that have been made of the results. But far
more important and basic than this is to provide
meaningful reports of relevant, reliable results to appropriate
educational decision makers.



What of the future of state testing programs? It is
difficult at this point to predict with any degree of
certainty whether they will flourish or languish. Clearly,
there is a need for the kind of information on educational
effectiveness that state testing programs can provide.
That need is likely to continue and to grow. What is not
clear is whether the leaders of state testing programs will
be able to supply that information meaningfully and
reliably and whether educational leaders will tolerate the
immediate pain it sometimes brings as a necessary price
to be paid for advancement of the enterprise to which
they are committed. An end to attacks on testing appears
unlikely in the foreseeable future. But the growing public
demand that claims about the quality of education in
a school be backed by solid evidence gives one grounds
for hope that those who support the assessment of
educational outcomes will prevail.



REFERENCES*

Segel, David. State testing and evaluation programs.
Washington, D.C.: United States Office of Education,
1951.

State testing programs: a survey of functions, tests,
materials and services. Princeton, N.J.: Educational
Testing Service, 1968.

State educational assessment programs: 1973 revision.
Princeton, N.J.: Educational Testing Service, 1973.
ED 080 582

State educational assessment programs. Princeton, N.J.:
Educational Testing Service, 1971. ED 056 102

State testing programs: 1973 revision. Princeton, N.J.:
Educational Testing Service, 1973. ED 087 789

*Items followed by an ED number (for example ED 069 762) are
available from the ERIC Document Reproduction Service (EDRS).
Consult the most recent issue of Resources in Education for the
address and ordering information.



